A Resource for Natural Language Processing of Swiss German Dialects
Abstract
Since only a few resources exist for Swiss German dialects, we compiled a corpus of 115,000 tokens, manually annotated with PoS tags. The goal is to provide a basic data set for developing NLP applications for Swiss German. We extended the original corpus and improved its annotation consistency. Furthermore, we trained dialect-specific PoS-tagging models and implemented a baseline system for dialect identification.
Similar resources
Compilation of a Swiss German Dialect Corpus and its Application to PoS Tagging
Swiss German is a dialect continuum whose dialects are very different from Standard German, the official language of the German part of Switzerland. However, when dealing with Swiss German in natural language processing, the detour through Standard German is usually taken. As writing in Swiss German has become more and more popular in recent years, we would like to provide data to serve as a steppin...
Syntactic transformations for Swiss German dialects
While most dialectological research so far focuses on phonetic and lexical phenomena, we use recent fieldwork in the domain of dialect syntax to guide the development of multidialectal natural language processing tools. In particular, we develop a set of rules that transform Standard German sentence structures into syntactically valid Swiss German sentence structures. These rules are sensitive ...
Continuous variation in computational morphology - the example of Swiss German
Most work in natural language processing is geared towards written, standardized language varieties. This focus is generally justified on practical grounds of data availability and socio-economical relevance, but does not always reflect the linguistic reality of sub-standard varieties. In this paper, we aim at the computational description of the morphology of a language with continuous interna...
Morphological analysis and lemmatization for Swiss German using weighted transducers
With written Swiss German becoming more popular in everyday use, it has become a target for text processing. The absence of a standard orthography and the variety of dialects, however, lead to a vast variation in different spellings which makes this task difficult. We built a system based on weighted transducers that recognizes over 90% of the tokens in certain texts. Weights ensure preferring ...
Declarative sentence intonation patterns in 8 Swiss German dialects
This study examines declarative sentence intonation contours in 8 vastly different Swiss German dialects by the application of the Command-Response model. Fundamental frequency patterns of a controlled declarative sentence are analyzed on the global and local level of intonation. The results provide evidence of a different patterning for the dialects in the context of how global and local level...
Publication date: 2015